
Machine Learning Models Rank Predictive Risks for Alzheimer's Disease - Neuroscience News

Summary: Using machine learning technology, researchers concluded that genetic risk may outweigh age as a predictor of whether a person will develop Alzheimer's disease. Once adults reach age 65, the threshold age for the onset of Alzheimer's disease, the extent of their genetic risk may outweigh age as a predictor of whether they will develop the fatal brain disorder, a new study suggests. The study, published recently in the journal Scientific Reports, is the first to construct machine learning models with genetic risk scores, non-genetic information and electronic health record data from nearly half a million individuals to rank risk factors by the strength of their association with eventual development of Alzheimer's disease. Researchers used the models to rank predictive risk factors for two populations from the UK Biobank: White individuals aged 40 and older, and a subset of those adults who were 65 or older. Results showed that age, which constitutes one-third of total risk by age 85 according to the Alzheimer's Association, was the biggest risk factor for Alzheimer's in the entire population, but for the older adults, genetic risk as determined by a polygenic risk score was more predictive.
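
For readers unfamiliar with the term, a polygenic risk score is commonly computed as a weighted sum of a person's risk-allele counts, with each weight taken from a genetic association study. The sketch below illustrates only that general idea; the variant names, weights and the function `polygenic_risk_score` are hypothetical and are not taken from the study, whose exact scoring method is not described here.

```python
def polygenic_risk_score(dosages, effect_sizes):
    """Weighted sum of risk-allele counts (0, 1 or 2 per variant).

    Variants missing from `dosages` are skipped (an assumption of this sketch).
    """
    return sum(
        weight * dosages[variant]
        for variant, weight in effect_sizes.items()
        if variant in dosages
    )


# Hypothetical per-variant weights (for example, log odds ratios) -- made up.
effect_sizes = {"variant_a": 0.50, "variant_b": 0.25, "variant_c": -0.125}

# Hypothetical genotype: number of risk alleles carried at each variant.
person = {"variant_a": 2, "variant_b": 1, "variant_c": 0}

print(polygenic_risk_score(person, effect_sizes))  # 1.25
```

A higher score means the person carries more, or more strongly weighted, risk alleles. On its own the score says nothing about absolute risk, and in the study it was used as one predictor among many in a model.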


Explainable machine learning aggregates polygenic risk scores and electronic health records for Alzheimer's disease prediction

Alzheimer’s disease (AD) is the most common late-onset neurodegenerative disorder. Identifying individuals at increased risk of developing AD is important for early intervention. Using data from the Alzheimer Disease Genetics Consortium, we constructed polygenic risk scores (PRSs) for AD and age-at-onset (AAO) of AD for the UK Biobank participants. We then built machine learning (ML) models for predicting development of AD using the UK Biobank dataset, and explored feature importance among PRSs, conventional risk factors, and ICD-10 codes from electronic health records (EHR), more than 11,000 features in total. We used eXtreme Gradient Boosting (XGBoost) and SHapley Additive exPlanations (SHAP), which provided superior ML performance and aided model explanation. For participants aged 40 and older, the area under the curve for AD was 0.88. For subjects aged 65 and older (late-onset AD), PRSs were the most important predictors. This is the first observation that PRSs constructed from AD risk and AAO play more important roles than age in predicting AD. The ML model also identified important predictors of developing AD from EHR, including urinary tract infection, syncope and collapse, chest pain, disorientation and hypercholesterolemia. Our ML model improved the accuracy of AD risk prediction by efficiently exploring numerous predictors and identified novel feature patterns.
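
To give a feel for how Shapley-based explanations rank predictors, here is a minimal sketch that computes exact Shapley values for a tiny hand-written scoring function by enumerating feature subsets. It is not the authors' pipeline: the study used XGBoost with the SHAP library, whereas this toy replaces both with plain Python. The function `toy_risk_model`, its coefficients, the three feature names and the sample values are all hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values for one sample `x`, relative to `baseline`."""
    n = len(x)

    def value(subset):
        # Features in `subset` keep their real value; the rest use the baseline.
        mixed = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for a coalition of this size that excludes i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_risk_model(features):
    # Hypothetical score with one interaction term; coefficients are made up.
    prs, years_over_65, uti = features
    return 0.8 * prs + 0.05 * years_over_65 + 0.3 * uti + 0.2 * prs * uti


names = ["prs", "years_over_65", "uti"]
sample = [1.5, 10.0, 1.0]    # hypothetical individual
baseline = [0.0, 0.0, 0.0]   # hypothetical reference individual

phi = shapley_values(toy_risk_model, sample, baseline)

# Rank features by the magnitude of their contribution to this prediction.
ranking = sorted(zip(names, phi), key=lambda pair: abs(pair[1]), reverse=True)
for name, contribution in ranking:
    print(f"{name}: {contribution:+.3f}")
```

The contributions sum to the difference between the model's output for the sample and for the baseline, which is the property that makes them usable as an additive explanation. Exact enumeration grows exponentially with the number of features, so with more than 11,000 features a tree-specific approximation such as the one in the SHAP library is needed in practice.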